REMARKS

The indication of claims 5, 10, and 12 containing allowable subject matter is noted. 

Claims 1 and 12 have been amended to overcome the rejection thereof based on
35 USC 112, paragraph 2.

Claim 8 has been cancelled and replaced by claim 13 to overcome the rejection 
thereof based on 35 USC 112, paragraph 2 and the rejection based on 35 USC 101. 

Amended claims 1-4, 6, 7 and 9 and new claim 13 are not rendered unpatentable
under 35 USC 103(a) as a result of Van den Akker (US 6,415,250) in view of De Campos
(US 6,272,456), the combination of references previously relied on to reject claims 1-4 and
6-9.

Applicant repeats the remarks filed on March 18, 2009, relating to the final office
action, in response to the examiner's arguments in paragraphs 3 and 9 of the
office action at page 3 relating to the rejection of claims 1-4 and 6-9.

In addition, paragraph [0014] of applicant's published application indicates the
language identification system of Van den Akker is limited to a single category of first
category strings, such as suffixes or prefixes or infixes, in a word (column 8, lines 5-12,
figure 5A). Column 20, lines 36-43, of Van den Akker states prefixes and suffixes of words
can be extracted. This does not mean one or plural prefixes and one or plural suffixes are
extracted from one word. Instead, it means one suffix or one prefix is extracted from one
word, i.e., one prefix can be extracted from a word and one suffix can be extracted from
another word. Van den Akker, at column 8, line 63 to column 9, line 3, and in claims 1, 3,
12 and 16 refers to suffixes and word endings. These portions of the reference do not
state that a combination of all the portions of one word can be extracted. Therefore, Van
den Akker does not include the requirements of amended independent claim 1 or new
independent claim 13 to analyze words extracted from digital text, to thereby construct for
each extracted word all the character strings contained in said extracted word, including all
the prefixes, suffixes and infixes in said extracted word, with overlap and different lengths
lying between one character and the number of characters in said extracted word. Van
den Akker analyzes only one character string per extracted word relative to the corpus of a
language, irrespective of the position of the character string in the word (and therefore, of
the length of the character string if the character string is a suffix or prefix).
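
By way of illustration only, constructing "all the character strings contained in" an
extracted word, with lengths from one character up to the length of the word, may be
sketched as follows. The function name is hypothetical and the code is not taken from the
application; it merely enumerates every contiguous character string of a word.

```python
def all_character_strings(word: str) -> list[str]:
    """Return every contiguous character string of `word`, of lengths 1..len(word).

    Hypothetical helper for illustration; strings of different lengths overlap freely.
    """
    n = len(word)
    return [word[i:i + k] for k in range(1, n + 1) for i in range(n - k + 1)]


strings = all_character_strings("vois")
print(len(strings))  # an n-character word yields n * (n + 1) / 2 strings
print(strings)
```

For a word of n characters this yields n(n+1)/2 character strings, as opposed to a single
suffix or prefix per word.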

The significance of the foregoing differences between applicant's claims 1 and 13
and Van den Akker can be seen by comparing certain parts of the Van den Akker
specification and applicant's specification.

In Van den Akker [column 11, line 65 to column 12, line 18], a "suffix of the each of
the parsed words 403" in the suffix extractor 404 "is determined to have a predetermined
number of characters at the end of each parsed word," and "the predetermined number of
characters in the parsed words 403 extracted by the suffix extractor 404 is three
characters," and "the source language of an unknown text 301" is identified "by analyzing
the last three characters of the parsed words 403," and "for example, four characters at the
end of the word may be used to capture the suffix." In the same manner, the
predetermined number of characters, three or four, defines the length of a prefix in the
parsed word, as indicated by column 8, line 58 to column 9, line 3, which states: "word
portions that contains the prefix may be used"; Fig. 2C. Obviously, when the parsed word
contains fewer than the predetermined number of characters, for example equal to three,
the extracted suffix or the extracted word portion with three or two or one characters is the
word itself (column 12, lines 27-32).
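
The fixed-length extraction described in the passages quoted above may be sketched as
follows. This is a minimal illustration under the stated assumption of a predetermined
length of three (or four) characters; the function name is hypothetical and the code is
not taken from Van den Akker.

```python
def fixed_length_suffix(word: str, n: int = 3) -> str:
    """Return the last `n` characters of `word` (hypothetical helper, n >= 1).

    Python slicing returns the whole word when it has fewer than `n` characters,
    which mirrors the short-word case noted above.
    """
    return word[-n:]


print(fixed_length_suffix("smashing"))        # one string per word: the last three characters
print(fixed_length_suffix("to"))              # shorter than three characters: the word itself
print(fixed_length_suffix("development", 4))  # predetermined length of four
```

Only one character string of predetermined length results for each parsed word.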

In De Campos (column 10, lines 46-65), as a three-letter window is slid over a
particular training document as "ABCABCKLM", the "ABC" letter sequence would appear
twice and the "KLM" letter sequence would appear only once. However, De Campos has
no disclosure of all the other letter sequences having more or less than three letters and
overlapping, such as "BCABC", "CABCKL", "ABCABC" and "AB."
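
The three-letter sliding window applied to "ABCABCKLM" may be sketched as follows, as an
illustration only. The function name is hypothetical and the code is not taken from De
Campos; it simply counts every window of a fixed width.

```python
from collections import Counter


def window_counts(text: str, width: int = 3) -> Counter:
    """Count each letter sequence seen by a window of fixed `width` slid over `text`."""
    return Counter(text[i:i + width] for i in range(len(text) - width + 1))


counts = window_counts("ABCABCKLM")
print(counts["ABC"], counts["KLM"])   # counts of the two three-letter sequences named above
print(counts["BCABC"], counts["AB"])  # sequences of other lengths are never counted
```

Because the window width is fixed, sequences of more or fewer than three letters receive
no count at all.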

Paragraph [0032] of applicant's published application states, "The first three
directories PRq, SUq and INq relate to morphemes, syllables and short character strings
CH of from one to six characters, for example" and more particularly relate respectively to
prefixes, suffixes and infixes in the predetermined languages. An infix is defined as a
character string that is included between the first character, i.e., the start, of a word, and
the last character, i.e., the end of the word, as e.g. "ou" or "oi" [paragraph 0036 of
applicant's published application]. In other examples, such as found in Van den Akker, at
column 8, lines 53-57 (see also figures 2B-2C), applicant's directories include the suffix
"ing" of the word "smashing", but also include the infixes "mash" and "ash" of the word
"smashing." The applicant's directories also include the suffix "ment" of the word
"development," and the infixes "velo", "ve", and "lo" of the word "development".

As a result of the above remarks, the word analyzing in the claimed device of claim
1 and method of claim 13 constructs all the character strings contained in an extracted
word and having lengths lying between one character and the number of characters in
said extracted word, such as the prefix "de", the suffix "ment", and the other character
chains "eve", "evelo", "lopm", "opme", etc., having infixes "velo", "ve", and "lo", included in
the extracted word "development" and thus including the partially overlapping chains as
stated at paragraph [0057] of applicant's published application: "In the latter variant, the
character strings CH contained in the extracted word MT and found in the directories PRq,
SUq and INq may partially overlap, in contrast to the n-grams of the approach disclosed in
US patent No. 6,292,772 B1 already commented on. For example, if the processed word
MT is the French word "aiment", the character strings "ment" and "ent" placed in the
pseudo-suffix directory SUq overlap in the processed word. Another example is the
overlap of infix "oi" and the pseudo-suffix "is" of the processed word "vois"."
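
The partial overlap described in paragraph [0057] may be illustrated by the following
sketch. The small sets standing in for directory contents, and the function names, are
assumptions made for illustration only; they are not the actual directories PRq, SUq and
INq of the application.

```python
def find_matches(word: str, directory: set[str]) -> list[tuple[int, int, str]]:
    """Return (start, end, string) for every character string of `word` found in `directory`."""
    return [
        (start, end, word[start:end])
        for start in range(len(word))
        for end in range(start + 1, len(word) + 1)
        if word[start:end] in directory
    ]


def overlap(a: tuple[int, int, str], b: tuple[int, int, str]) -> bool:
    """True when two matches share at least one character position in the word."""
    return a[0] < b[1] and b[0] < a[1]


# Hypothetical stand-ins for directory entries, using the examples quoted above.
aiment = find_matches("aiment", {"ment", "ent"})
vois = find_matches("vois", {"oi", "is"})
print(aiment, overlap(*aiment))
print(vois, overlap(*vois))
```

Both found strings are retained for each word even though they occupy common character
positions.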

Therefore, applicant repeats his position that Van den Akker fails to disclose the
function of the claimed analyzing means or step and more particularly the means for or
step of analyzing words extracted from said digital text to thereby construct for each
extracted word all the character strings, as set forth in paragraph 0049 of applicant's
published application which indicates analyzer 6 analyzes each extracted word MT to
construct all the character strings CH included in the extracted word MT, including all the
prefixes and suffixes in the extracted word, as well as the character chains included
between the start and end of the extracted word and overlapping (e.g. infixes and pseudo-
infixes), and having lengths lying between one character and the number of characters in
said extracted word.

Van den Akker also fails to disclose the requirements of claims 1 and 13 relating to
(1) the pre-stored first character strings, including all the prefixes and suffixes of words of a
plurality of predetermined languages and character chains included between the start and
end of said words and overlapping (including infixes and pseudo-infix infixes), that occur
frequently anywhere respectively in said words of said predetermined languages, and (2)
the pre-stored second character strings of different lengths (including prefixes and
suffixes) in said words of said plurality of predetermined languages and character chains
included between the start and end of said words and overlapping (e.g. infixes and
pseudo-infix infixes), that are atypical anywhere respectively in words of said
predetermined languages.

Applicant also disagrees with the contention in the office action that Van den Akker
bases the score of a character chain extracted from a parsed word on the position of the
first character chain in the extracted word of a parsed word. For example, the first
character chain of a Van den Akker parsed word is a suffix or a prefix and always has the
same position in the extracted words, i.e., the last position in all the parsed words ended
by said suffix, counted from the last character in the parsed words (or for a prefix, the first
position in all the parsed words starting with said prefix, counted from the first character in
the parsed words).

The position of a character chain in a parsed word is not disclosed by Van den
Akker in the paragraph in column 9, lines 18-41, cited in the office action. This relied on
portion of Van den Akker only relates to the frequency of a character chain in words of a
language corpus. A selected word portion (character chain), such as a suffix, also called
a "word ending" (or a prefix) "was extracted from the corresponding language corpus 309
along with a frequency value indicative of the number of times the selected word portion
was found within the corresponding language corpus 309. These frequencies are
subsequently normalized to a common corpus size, serving as relative frequencies of each
of the selected word portion in each language"; column 9, lines 34-41 and the frequency
lists in column 13, lines 33-42. The frequency value is indicative of the number of times
the selected word portion was found within the corpus and does not take into
consideration the position of the character string in the parsed word. For example, one
suffix and one infix having the same frequency value can have different positions, and two
suffixes or prefixes having the same length can have different frequencies.

Applicant's claims 1 and 13 indicate the position of a character chain in an
extracted word defines a first coefficient. This limitation differs from (1) the frequency
value of a selected word portion disclosed by Van den Akker and (2) the frequency of a
character chain in an extracted word, as defined in applicant's claims 3 and 5, and
disclosed by paragraph [0059] of applicant's published application. For example, the
coefficients PO, FR and LON respectively depend on (1) the position of the character
string CH in the extracted word MT, (2) the frequency of the character string CH in the
determined language, Lq, and (3) the length of the character string CH. For example, in
the extracted words "development", "abutment" and "government", the suffix "ment" always
has the last position.

Van den Akker does not consider any distinction relating to the position of the
character chain "ment" in these three words. In contrast, applicant assigns three different
coefficients depending on the positions 8, 4, 3, 6 and 6 of the character chain "ment"
counted from the beginning of the words "development", "alimented", "lamentable",
"incremental" and "rudimental," or the positions 4, 6, 8, 6 and 6 of the character chain
"ment" counted from the ends in the five words respectively. In a corpus including the
words "development", "abutment", "government", "alimented", "lamentable", "incremental"
and "rudimental", the frequency of the character chain "ment" is seven and different from
the values of the positions.
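
The distinction between the position of a character chain in a word and its frequency in
a corpus may be checked with the following sketch. The counting convention is an
assumption: positions are 1-based and refer to the first character of the chain, counted
either from the first or from the last character of the word. The function name is
hypothetical.

```python
def chain_positions(word: str, chain: str) -> tuple[int, int]:
    """Return the 1-based position of the first character of `chain` in `word`,
    counted from the beginning and from the end of the word (assumed convention)."""
    start = word.index(chain)  # raises ValueError if the chain is absent
    return start + 1, len(word) - start


five_words = ["development", "alimented", "lamentable", "incremental", "rudimental"]
from_start = [chain_positions(w, "ment")[0] for w in five_words]
from_end = [chain_positions(w, "ment")[1] for w in five_words]

corpus = ["development", "abutment", "government"] + five_words[1:]
frequency = sum(w.count("ment") for w in corpus)

print(from_start, from_end, frequency)
```

Under this convention the positions counted from the ends are 4, 6, 8, 6 and 6 and the
frequency is seven, as stated above; the count from the beginning for "rudimental" comes
out as 5 under the same convention. In every case the frequency is a property of the
corpus, whereas the position is a property of each individual word.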

Therefore, Van den Akker fails to include the requirements of claims 1 and 13 for a
score that is calculated by adding to the score a first coefficient whenever a prestored first
character string of said one determined language is found in said extracted word, wherein
said first coefficient depends on the position of the found prestored first character string of
said one determined language in said extracted word.
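
A score of the kind recited above may be sketched as follows, purely for illustration.
The weighting function is a hypothetical placeholder chosen only to show a coefficient
that varies with position; it is not the coefficient PO of the application, and the set
of first character strings is likewise assumed.

```python
def position_coefficient(start: int, word_length: int) -> float:
    """Hypothetical position-dependent weight; NOT the formula of the application."""
    return 1.0 + start / word_length


def score_word(word: str, first_strings: set[str]) -> float:
    """Add a position-dependent coefficient for each prestored string found in `word`."""
    score = 0.0
    for start in range(len(word)):
        for end in range(start + 1, len(word) + 1):
            if word[start:end] in first_strings:
                score += position_coefficient(start, len(word))
    return score


first_strings = {"ment"}  # assumed stand-in for a prestored first character string
print(score_word("development", first_strings))
print(score_word("alimented", first_strings))
```

The same character string contributes differently to the score of each word because it is
found at a different position, which a frequency value alone cannot express.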

Van den Akker, at column 13, line 62 to column 14, line 24, effectively discloses
that the probability, i.e., the normalized frequency value, of a word portion is negative to
indicate a strong likelihood that the language of the input text is not the corresponding
language. However, the Van den Akker word portion can be a suffix, while a second
coefficient in claims 1 and 13 is associated with a found prestored second character string
which can be a prefix, a suffix or any word chain included between the start and end of
words of the predetermined languages and overlapping (including infixes and pseudo-infix
infixes), as disclosed by paragraphs [0042]-[0044] of applicant's published application.

Because of the foregoing differences between claims 1 and 13 and Van den Akker,
that are not considered in the office action, there is no need to discuss De Campos that the
examiner alleges concerns the second coefficient associated with a second character
string that depends on a predetermined language. This is particularly the case because
neither Van den Akker nor De Campos discloses the requirements of claims 1 and 13
relating to a first coefficient that depends on the position of a character string found in an
extracted word.

Because claims 1 and 13 are allowable over Van den Akker in view of De Campos,
dependent claims 2-4, 6, 7 and 9 are also allowable.

To the extent necessary, a petition for an extension of time under 37 C.F.R. 1.136
is hereby made. Please charge any shortage in fees due in connection with the filing of 
this paper, including extension of time fees, to Deposit Account 07-1337 and please credit 
any excess fees to such deposit account. 

Respectfully submitted, 
LOWE HAUPTMAN HAM & BERNER, LLP 



1700 Diagonal Road, Suite 300 

Alexandria, Virginia 22314 

(703)684-1111 

(703) 518-5499 Facsimile 

Date: September 22, 2009 

AlUIL/qf 




Allan M. Lowe 
Registration No. 19,641